ABSTRACT: As mixed-reality (XR) technology becomes more available, virtually simulated training scenarios 
have shown great potential in enhancing training effectiveness. Realistic virtual representation plays a crucial role 
in creating immersive experiences that closely mimic real-world scenarios. With reference to previous 
methodological developments in the creation of information-rich digital reconstructions, this paper proposes a 
framework encompassing key components of the 3D scanning pipeline. While 3D scanning techniques have 
advanced significantly, several challenges persist in the field. These challenges include data acquisition, noise 
reduction, mesh and texture optimisation, and separation of components for independent interaction. These 
complexities necessitate the search for an optimised framework that addresses these challenges and provides 
practical solutions for creating realistic virtual representations in immersive training environments. The following 
exploration acknowledges and addresses challenges presented by the photogrammetry and laser-scanning pipeline, 
seeking to prepare scanned assets for real-time virtual simulation in a games engine. This methodology employs 
both a camera and a handheld laser scanner for accurate data acquisition. Reality Capture is used to combine the 
geometric data and surface detail of the equipment. To clean the scanned asset, Blender is used for mesh retopology 
and reprojection of scanned textures, with attention given to correct lighting details and normal mapping, thus 
preparing the equipment to be interacted with by Virtual Reality (VR) users within Unreal Engine. By combining 
these elements, the proposed framework enables realistic representation of industrial equipment for the creation 
of training scenarios that closely resemble real-world contexts. 


KEYWORDS: Digital twin; 3D reconstruction; Virtual reality; Laser scanning; Photogrammetry; Training 
simulation; Unreal Engine. 


1. INTRODUCTION 


In recent years, the increased availability of mixed-reality (XR) technology has spurred the exploration of virtual 
reality training environments, which showcase their immense potential in enhancing training effectiveness across 
various domains (Abulrub et al., 2011). By reducing expenditure associated with travel and physical resources, 
safety training that has been delivered via virtual methods is predominantly more cost-effective than non-virtual 
alternatives, without sacrificing training effectiveness (Adami et al., 2021; Stefan et al., 2023). 


Virtual Reality (VR) can present us with realistic replications of real-world situations with a high degree of 
accuracy, and immersive virtualised training scenarios can significantly improve participant engagement when 
compared to equivalent training using conventional methods (Sacks et al., 2013). Trainees presented with a virtual 
environment can engage with high-risk scenarios without actual danger. The elimination of risk fosters confidence 
and risk-free experimentation, which has a significant positive impact upon post-training technical proficiency 
(White & Jung, 2022). Regarding the attitude of trainees towards professional learning content, Loosemore and 
Malouf (Loosemore & Malouf, 2019) suggest that there is “a need to adapt safety training to create more emotional 
connection” between the trainees and their learning within the construction industry, and that “New technologies 
such as virtual reality may be useful this context since through [life-like] immersion in the work environment and 
simulation of workplace accidents, they are able to create a stronger emotional connection with the subject matter.” 
This suggestion is supported by Newton, Wang and Lowe (Newton et al., 2015) who find that “incongruously, 
results indicate that user’s reporting their experience of virtual reality score that experience higher in presence 
terms than users experiencing the physical world,” indicating that virtual experiences may be more emotionally 
engaging and more impactful for trainees than real-world experiences alone. This calls us to re-examine our 
approach to training and education as we begin to see XR technology as an effective tool to enable trainees to 
connect theoretical knowledge and practical application. 


The standard of these simulations is influenced by the quality of virtual representation. High-fidelity 3D illusions 
bridge the gap between physical and digital environments and enhance the task-oriented performance of the 
trainees (Slater, 2009), and so highly realistic virtual assets may improve the effectiveness of the virtual experience. 


1.1 3D Scanning Methodologies 


To elevate the authenticity and realism of virtual training, 3D scanning methodologies (such as 
photogrammetry and laser scanning) present exciting possibilities as potential solutions for highly realistic 
representation within VR scenarios. By employing advanced 3D scanning technologies, we can capture with 
accuracy the dimensions and intricate surface details of real-world equipment and environments. After sufficient 
data has been captured with scanning hardware, the data will be manipulated through a pipeline of various 
specialized 3D modelling software to create a mesh that may be rendered by a games engine. 


There are practical challenges associated with the application of 3D scanning techniques which must be addressed, 
such as site-access for data acquisition, followed by noise reduction and asset optimization. To conduct the training, 
the user will be expected to manipulate the asset, or parts of the asset, using virtual reality hardware. Therefore, 
not only aesthetic accuracy but also realistic interaction and functionality will be essential. Equipment which has 
independently moving components will have to be separated into dynamic and static bodies to facilitate 
independent movement and interaction within the virtual environment. 


1.2 Goals of this Article 


Our effort to establish a framework that adheres to industry best practices has been in collaboration with The 
Faraday Centre, recognised for their expertise in electrical engineering training. Ordinarily, The Faraday Centre 
delivers training using out-of-service switchgear that has been refurbished or donated to the Centre, so that trainees 
can receive hands-on practical training with switchgear up to 33kV. A significant challenge presented by electrical 
engineering equipment is that there are high costs associated with the newer, higher-voltage switchgear, thus 
making their acquisition impractical. A virtual training environment (VTE) offers a cost-effective alternative to 
simulate operation of this high-voltage equipment for training purposes. Our data-driven approach aims to ensure 
that the virtual representations closely mirror their physical counterparts. 


Therefore, we believe that establishing a framework encourages the integration of virtual technologies for 
industrial training scenarios. Our objective is to provide insights into the scanning methodologies, challenges faced, 
and available solutions in capturing the details of real-world environments, equipment, or other assets. To achieve 
this, this paper will review the current technology and methodologies used to emulate real-world equipment and 
their processes within a virtual context. Drawing inspiration from methodologies employed in data-driven digital 
twinning pipelines (Pan et al., 2022), both photogrammetry and laser-scanning applications are integrated within 
this framework and their compatibility with the development of contemporary professional training for high-risk 
environments is discussed. The framework proposed is capable of systematically addressing each obstacle, thereby 
ensuring a seamless transition from physical equipment to the creation of highly realistic virtual training 
environments. 


This paper is organized as follows: Section 2 reviews production pipelines, methods and motives for the 
creation of such data-driven virtual assets. Section 3 presents an overview of the technology required to scan a 3D 
object and recreate it as a 3D virtual asset. Section 4 reports the framework we have developed as a solution to 
the challenges presented when developing realistic VR-ready assets from high-voltage switchgear scan-data. 


2. METHODS FOR REALISTIC VIRTUAL REPRESENTATION 


Virtual representation encompasses the creation of digital reconstructions of real-world subjects, including those 
with glossy surfaces like switchgear equipment. Specular (mirror-like) reflections can challenge 3D data capture 
methods like laser scanning and photogrammetry, reducing the usefulness of output models (Frost et al., 2023); 
therefore, we will review approaches designed to address issues associated with capturing accurate data. 


Another challenge involves minimizing the computational power required for rendering our results in a real-time 
application. Two viewpoints (one for each eye) must be rendered, making VR susceptible to difficulties with 
framerate, which will be affected negatively by superfluous model complexity. Therefore, our review will be 
extended to provide an overview of various methods to clean and simplify our results. 


2.1 Photogrammetry 


Photogrammetry is a 3D surveying and modelling method which has the major advantages of being low-cost, 
portable and flexible, and is capable of delivering highly detailed reconstructions. Three-dimensional information 
about objects or environments is obtained by analyzing a dataset of two-dimensional photographs. 


Photogrammetry relies on the identification of feature points on or within the object being scanned. Areas of the 
subject with aspects like colour variation, surface imperfections, or details such as dust and grime must be 
adequately captured to be reconstructed. Significant overlap across multiple images in the dataset is crucial to 
ensure an ample supply of contrasting, unique points. Observed similarities across images are used to reinforce the 
confidence of the photogrammetry software in determining the 3D positions of each point. Available 
photogrammetry software options are discussed in Section 3. 
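
As an illustration of how matched feature points yield 3D positions, the following is a minimal sketch of linear (DLT) triangulation from two views. It assumes the camera projection matrices are already known from alignment; the function names, the two cameras and the test point are hypothetical and are not taken from any particular photogrammetry package.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 camera projection matrices (assumed known from alignment).
    x1, x2: (u, v) image coordinates of the same feature in each view.
    """
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point into an image using projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two hypothetical cameras: one at the origin, one shifted 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
point = np.array([0.5, 0.2, 4.0])

recovered = triangulate_point(P1, P2, project(P1, point), project(P2, point))
print(recovered)
```

With more overlapping images, each additional view adds two rows to the system, which is one way to see why greater overlap increases confidence in each reconstructed point.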


Shoot pictures of the subject: The initial phase involves capturing multiple images of the subject from various 
angles and positions. These images serve as the raw data for the subsequent steps. 

Photogrammetry reconstruction: This stage involves using specialized software to process the captured images. 
Photogrammetry algorithms analyze the images, detect feature points, and create a 3D point cloud or depth map of 
the subject. 

Retopology: After generating the 3D model, it often requires creating a more efficient mesh to optimise 
simulation performance. Additionally, UV mapping is performed to prepare the model for texturing. Any holes or 
imperfections in the mesh are addressed. 

Reprojection: Once the model is optimized, the next step is to reproject the high-resolution textures or color 
information from the original images onto the optimized 3D model. This ensures that the final asset retains the 
visual details captured during shooting. 

Post-clean: This step involves refining the model and texture. It includes cleaning up any artifacts or anomalies in 
the mesh and textures, adjusting lighting and shading, and fixing any incomplete parts of the model or texture. 

Export: The final phase includes baking necessary texture maps (such as normal maps, ambient occlusion maps, 
and specular maps) and exporting the asset in a format suitable for integration into a game engine. Again, 
decimation may be performed to optimize the asset's polygon count for real-time rendering. 


Fig. 1: A photogrammetry process diagram showing an overview of the various stages from data capturing to 
a simulation-ready asset 


SECTION A - EXTENDED REALITY TECHNOLOGIES IN CONSTRUCTION 


The accuracy of camera alignment and the quality of the created asset is determined by consistency across the data 
obtained from the input images. In cases where the object's surface lacks distinctive features, challenges arise in 
achieving accurate surface reconstruction. According to Schaich and Fritsch, the objects best suited for automated 
image-based 3D reconstruction methods feature amorphous geometries, structured surfaces, numerous edges, and 
inhomogeneous colouring. Objects that yield poor or no results typically have monochrome, translucent, reflective, 
or self-resembling surfaces (Schaich & Fritsch, 2013). Dark materials, insufficient lighting, and changes in lighting 
can all have detrimental effects on the image quality and may prevent the photograph from registering as correctly 
aligned. Methods we may employ to optimise the conditions in which we capture data include strategically 
distributing light sources to eliminate shadows, applying a coat of spray to make the surface more responsive to 
scanning, using cross polarisation techniques, or some combination of these methods (Noya et al., 2015; Porter et 
al., 2016). 


2.1.1 Capture methods 


To capture a static object, the photographer moves around the subject, taking multiple pictures from various 
viewing angles. Collecting every angle may be made difficult if the object is quite large and/or positioned 
inconveniently for photo-scanning purposes, meaning a complete scan may be impossible without repositioning 
the object. For the feature detection algorithms to run correctly, the features of the input images must remain 
consistent. Therefore, if we wish to reposition the object, we must take the additional step of separating desirable 
features of our subject from undesirable inconsistencies from background visual information. Typically, this 
involves manually applying masks to each input image, a potentially time-consuming process (Farella et al., 2022), 
even with expediting background removal features like semantic segmentation (Chen et al., 2017; Kang & An, 
2021; Ronneberger et al., 2015). 


Alternatively, a camera configuration with strategic lighting can be set up to automate the masking process. 
Background interference may be avoided by ensuring the scanned object is well-lit against a dark, featureless 
background. This allows for the target to be rotated and repositioned in front of a camera which may remain fixed, 
providing sufficient captured data from various viewing angles, without the feature detection algorithms being 
disrupted by undesirable information. The effect of this method may be improved by strengthening the lighting of 
the foreground to heighten the contrast between the foreground and background. This lighting can be provided in 
different ways: the object may be homogeneously lit with LEDs from various angles, or a piece of equipment such 
as a ring light may be employed; both may sufficiently eliminate shadows. 
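
The automated masking described above can be sketched as a simple luminance threshold, assuming a well-lit subject against a dark, featureless background. The function name and the threshold value are hypothetical; in practice the cut-off would be tuned to the capture conditions.

```python
import numpy as np

def foreground_mask(image, threshold=0.1):
    """Return a boolean mask that is True where a pixel is brighter than the backdrop.

    image: float array with values in [0, 1] and shape (height, width, 3).
    threshold: hypothetical luminance cut-off separating subject from background.
    """
    # Rec. 709 luma weights convert RGB to a single luminance value per pixel.
    luminance = image @ np.array([0.2126, 0.7152, 0.0722])
    return luminance > threshold

# Toy image: a bright 2x2 "object" in the middle of a black 4x4 frame.
image = np.zeros((4, 4, 3))
image[1:3, 1:3] = [0.8, 0.7, 0.6]

mask = foreground_mask(image)
print(mask.astype(int))
```

Strengthening the foreground lighting widens the gap between subject and background luminance, which makes the choice of threshold less sensitive.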


[Figure 2 flowchart. Panel titles: Data Collection; Reality Capture; Instant Meshes. Steps shown: dorsal photos 
(multi-camera array); ventral (underneath) photos (multi-camera array); additional individual photos (handheld 
camera); Scan 1: dorsal scan reconstruction; Scan 2: ventral scan reconstruction; scan alignment; manually stitch 
scans, synthesising a mesh; decimate mesh to reduce polygon count; draw seams to guide retopology for clean-edge 
UVs; create a new mesh using the aligned scans as a template; recover high-frequency geometry details.] 


Fig. 2: A flowchart describing the process used to create a clean asset from a photogrammetry reconstruction 
using a multi-camera array to capture a turtle (Bot et al., 2019). The software used is included. Recovering 
high-frequency geometry details will be expounded upon in Section 2.3. 


To capture dynamic objects, a single camera is unsuitable as it presents a high risk of capturing inconsistent data 
due to movement of the subject. Therefore, a multi-camera array is used, which typically consists of 4 to 30 cameras 
on tripods or metal rods, with all of them pointing towards a central area. This “rig” of specially calibrated lights 
and cameras permits efficient and simultaneous data capture from various angles to ensure consistency across 
source images. An alternative method is the use of video synchronized using a common motion cue (e.g., a clapper 
or a ball drop in view of all cameras). Figure 2 shows the methodology employed by Bot et al. (2019) when using a 
multi-camera array to scan and create an asset that captures the likeness of a turtle. 


2.1.2 Cross polarisation and reflectance acquisition 


VR is capable of simulating realistic lighting and accurate material properties. Reflectance acquisition techniques 
are used to measure an object's reflectance properties under varying lighting conditions. One such approach using 
polarisation techniques is outlined by figure 3, below. Numerous images are taken with different lighting 
conditions to sample the appearance of specular highlights under a dense sampling of lighting directions, which 
can be data-intensive and time-consuming, particularly when dealing with highly specular surfaces. 


[Figure 3 flowchart. Steps shown: 

Data Collection: Set up dark background, polarised light source and subject. Calibrate camera with polarisation 
filter to minimise reflections. Capture cross-polarised image. Adjust the camera's polarisation filter, rotating it 90°. 
Capture parallel-polarised image. Adjust subject for different viewing angle. Repeat until capture is sufficient. 

Dataset preparation: Create 2 datasets: one of the cross-polarised images and one of the parallel-polarised images. 
Pair corresponding images by renaming the parallel-polarised image to be identical to its cross-polarised 
counterpart. 

Image processing: Open parallel-polarised image. Overlay the corresponding cross-polarised image. Edit Layer 
Blend to Subtract so that the cross-polarised layer is subtracted from the parallel-polarised layer, necessary to 
calculate the light diffusion of the surface. Merge the layers by flattening the image. Desaturate the image, making 
it black and white. Save the file, overwriting the parallel-polarised image. Repeat until each parallel-polarised 
image has been converted into a specular image. It is possible to automate this workflow by creating a Photoshop 
Action (a macro). 

Reconstruction and material setup (text partly illegible in the source): Reconstruct mesh. Rebuild texture. Replace 
input […]. Export 3D mesh. Invert the specular texture to create a […]. Set the material of the 3D mesh to the 
cross-polarised […]. Within the material settings, set the roughness equal to the roughness map multiplied by a […].] 


Fig. 3: A flowchart showing an overview of the data processing required to prepare what information is 
collected in a cross-polarisation method (Frost et al., 2023) to acquire reflective data, including the software 
employed. 


Cross polarisation methods produce an image where most of the specular data is removed using two orthogonal 
polarisation filters. One filter is placed on the camera lens, and the other is a polarising film positioned in front of 
the light source to illuminate the target with polarised light. Cross-polarised images are highly effective for 
photogrammetry reconstruction as they minimise disruptions caused by reflections (Frost et al., 2023). 


The polarisation filter on the camera lens can be adjusted to be parallel rather than orthogonal, thus producing a 
corresponding image which preserves specular information. Subtracting the cross-polarised image from the 
parallel-polarised image yields a specular image. Collecting specular images from multiple camera positions 
allows us to create a specular map by replacing the cross-polarised data with the specular data during the 
reconstruction process. This map represents the reflectivity of the object's surface at different locations on the 
mesh. However, achieving this in uncontrolled environments, where ambient lighting is beyond our control or with 
large equipment that requires camera movement, can be challenging and result in inconsistencies. 
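
The subtraction described above can be sketched numerically as follows. This is a minimal illustration rather than the workflow of Frost et al. (2023): it assumes the paired images are already aligned and stored as floating-point arrays in [0, 1], it desaturates by averaging the colour channels, and the final inversion into a roughness-style map is an assumption based on the steps listed in Figure 3. All names are hypothetical.

```python
import numpy as np

def specular_and_roughness(parallel, cross):
    """Derive a greyscale specular image and its inverse from a polarised image pair.

    parallel, cross: aligned float arrays in [0, 1] with shape (height, width, 3).
    """
    # Subtract the cross-polarised (mostly diffuse) image; clamp negatives to zero.
    difference = np.clip(parallel - cross, 0.0, 1.0)
    # Desaturate with a simple channel average (an assumed, not prescribed, choice).
    specular = difference.mean(axis=-1)
    # Inverting the specular image gives a roughness-style map (assumption).
    roughness = 1.0 - specular
    return specular, roughness

# Toy pair: identical images except for one highlight in the parallel-polarised shot.
cross = np.full((2, 2, 3), 0.5)
parallel = cross.copy()
parallel[0, 0] = 0.9

specular, roughness = specular_and_roughness(parallel, cross)
print(specular)
print(roughness)
```

Pixels where both images agree produce zero specular response, which is why inconsistent ambient lighting between the two exposures would appear as false highlights.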


2.1.3 Colour correction 


To ensure the accuracy of the model texture, especially for its use in a games engine for simulation, managing 
lighting conditions is crucial. If lighting affects the colour of the captured images, a look-up table (LUT) can be 
applied to the input images to correct their colour accuracy. Software like Houdini (SideFX, 2022) or Photoshop 
(Adobe, 2022) can generate this LUT from an image of a colour checker taken at the site under the same lighting 
conditions as the photos, and then batch process the input images, correcting colour information. 
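
The principle behind correcting images from a colour checker can be sketched with a simpler stand-in for a LUT: a 3x3 colour matrix fitted by least squares. This is not how Houdini or Photoshop build their LUTs; it only illustrates the idea of mapping measured patch colours to their known reference values. The patch values and the colour cast below are invented for the example.

```python
import numpy as np

def fit_colour_matrix(measured, reference):
    """Least-squares 3x3 matrix M such that measured @ M approximates reference.

    measured, reference: arrays of shape (patches, 3) holding RGB values in [0, 1].
    """
    M, *_ = np.linalg.lstsq(measured, reference, rcond=None)
    return M

def apply_colour_matrix(pixels, M):
    """Apply the fitted matrix to RGB pixels and keep the result in [0, 1]."""
    return np.clip(pixels @ M, 0.0, 1.0)

# Hypothetical reference colours of four checker patches.
reference = np.array([
    [0.2, 0.3, 0.4],
    [0.8, 0.1, 0.1],
    [0.1, 0.7, 0.2],
    [0.5, 0.5, 0.5],
])
# Simulate a warm colour cast from site lighting (per-channel gain).
measured = reference * np.array([1.1, 1.0, 0.9])

M = fit_colour_matrix(measured, reference)
corrected = apply_colour_matrix(measured, M)
print(np.round(corrected, 3))
```

The same fitted matrix would then be applied to every input photograph taken under those lighting conditions, which corresponds to the batch processing step described above.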


Most games engines have their own lighting systems. Depending on the 3D objects being rendered, most 3D games 
engines simulate realistic shadows for objects in relation to in-simulation light sources. These shadows can be 
dynamically calculated at runtime, adjusting with user interactions or object movements. In some cases, shadows 
might be baked into the scene if they are not expected to change. If shadows were captured in the source photos 
due to non-flat lighting during image capture, they could inadvertently become part of the object's texture 
information. To address this, the shadow information should be removed. This can be achieved by opening the 
texture data from the UV maps in software like Photoshop, where adjustments can be made to minimize or 
eliminate the shadows. This process homogenizes and evens out the lighting affecting the texture, allowing the 
games engine's lighting to handle shadows appropriately. 


2.2 LiDAR 


In recent decades, point clouds obtained through light detection and ranging (LiDAR) have become a significant 
data source for various mapping applications within the photogrammetry, remote sensing, and cultural heritage 
communities, among many others (Leberl et al., 2010; Wang et al., 2018). There are two primary scanning methods 
to consider: laser scanning and structured light scanning. Laser scanners commonly use time-of-flight (ToF) 
calculations to determine the distance to a surface, whereas structured light scanners triangulate from a projected 
pattern; both produce a point cloud of the object's surface. Their advantages include their non-invasive nature, 
high precision, and easy interoperability with supporting software. 
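
As a worked illustration of the time-of-flight principle, the sketch below converts a round-trip time into a distance and then into a single point of a point cloud. The function names, the 100 ns echo and the beam angles are hypothetical values chosen for the example, not specifications of any scanner.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def tof_distance(round_trip_seconds):
    """Distance to the surface: the pulse travels there and back, so halve the path."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

def polar_to_point(distance, azimuth, elevation):
    """Convert a range and two beam angles (radians) into an (x, y, z) point."""
    horizontal = distance * math.cos(elevation)
    return (
        horizontal * math.cos(azimuth),
        horizontal * math.sin(azimuth),
        distance * math.sin(elevation),
    )

# Hypothetical reading: an echo arrives 100 nanoseconds after emission.
distance = tof_distance(100e-9)
point = polar_to_point(distance, math.radians(30.0), math.radians(10.0))
print(round(distance, 3), point)
```

Repeating this conversion for every beam direction swept by the scanner yields the point cloud of the scanned surface.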


Aerial laser scanning (ALS) and terrestrial laser scanning (TLS) are two examples of long-range scanning methods 
that rely on laser beam emission. The emitted lasers can reflect off surfaces up to 130 meters away, and can be 
used to scan large objects such as airplanes. The Focus3D S120 (FARO) is a laser scanner employed by Wang et 
al. (2019), as described in figure 5, so this method may be fit for our purposes. However, long-range scanning can be 
more expensive and may require more time for data processing. 


Structured light scanners project patterns of light (such as grids or stripes) onto the surface of an object. The 
deformation of these patterns on the object's surface is captured by the scanner's cameras. The distortion of the 
patterns is then used to calculate the 3D coordinates of the object's surface points. Cui, Tao and Zhao (Cui et al., 
2021) acknowledge that the 3D light-section reconstruction method (depicted in figure 4) is a common and 
applicable way to obtain point cloud data for 3D reconstruction, potentially accurate to the millimeter. Structured 
light scanners 
are generally faster than laser scanners and are well-suited for capturing medium-sized objects with moderate to 
high surface details. 


However, like photogrammetric methods, structured light scanners struggle with reflective, transparent, or 
homogeneous surfaces. Their accuracy can vary based on the complexity of the object's surface; for example, the 
performance of these scanners suffers when there is a distinct lack of points of interest on the surface, as it makes 
it difficult for the algorithms within the software to accurately track the laser's position frame by frame. 
Consequently, the scanner will “slip,” leading to inaccuracies in scanning surfaces. We may mitigate some of these 
issues by scanning the surface multiple times, or by introducing additional features to aid 3D registration. 


Fig. 4: A process diagram showing the light-section method for a structured light scan (Cui et al., 2021) 


2.3 Model Synthesis for Virtual Reality 


To achieve realistic virtual representation, it is crucial to capture high-frequency details. However, this often results 
in high-polygon-count 3D models generated by scanning methods, which can slow down real-time simulations, 
especially in virtual reality. Mesh decimation helps reduce the complexity by simplifying the mesh to a target 
polygon count, although some detail is lost in the process. As depicted in figure 2, in cases where the scan data has 
inconsistencies, further reconstruction and cleaning with 3D editing software might be necessary. Alternatively, 
the scan can serve as a reference for creating a new, more accurate mesh. 
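
To make the trade-off in mesh decimation concrete, the following is a minimal sketch of one simple decimation strategy, vertex clustering, in which all vertices falling inside the same grid cell are merged. This is a swapped-in technique for illustration only: the tools named in this paper may use other approaches (such as edge collapse), and clustering is controlled by a cell size rather than a target polygon count. The function name and the toy mesh are hypothetical.

```python
import numpy as np

def cluster_decimate(vertices, faces, cell_size):
    """Simplify a triangle mesh by merging vertices that share a grid cell.

    vertices: float array of shape (n, 3); faces: int array of shape (m, 3).
    cell_size: edge length of the clustering grid (larger means coarser output).
    """
    # Assign every vertex to an integer grid cell.
    cells = np.floor(vertices / cell_size).astype(np.int64)
    # 'inverse' maps each original vertex to the index of its cluster.
    _, inverse = np.unique(cells, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_clusters = int(inverse.max()) + 1

    # Represent each cluster by the mean position of its member vertices.
    sums = np.zeros((n_clusters, 3))
    np.add.at(sums, inverse, vertices)
    counts = np.bincount(inverse, minlength=n_clusters)
    new_vertices = sums / counts[:, None]

    # Re-index faces and drop triangles that collapsed to a line or a point.
    new_faces = inverse[faces]
    keep = (
        (new_faces[:, 0] != new_faces[:, 1])
        & (new_faces[:, 1] != new_faces[:, 2])
        & (new_faces[:, 0] != new_faces[:, 2])
    )
    return new_vertices, new_faces[keep]

# Toy mesh: two triangles, with vertices 1 and 2 almost coincident.
vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.05, 0.05, 0.0], [0.0, 1.0, 0.0]])
faces = np.array([[0, 1, 3], [1, 2, 3]])

new_vertices, new_faces = cluster_decimate(vertices, faces, cell_size=0.5)
print(len(new_vertices), len(new_faces))
```

The sliver triangle disappears because two of its corners merge, which is the kind of fine detail that decimation sacrifices and that normal maps are later used to restore visually.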


High-frequency detail can be restored by generating normal maps from the complex mesh, which are used to 
create detailed shadows and highlights. Unwrapping the mesh's topology into UVs is required to store this data as 
a texture file. Specialised software can assist here, such as Instant Meshes, as mentioned in (Bot et al., 2019), or 
similarly specific tools like AutoUV from SideFX Labs for Houdini (SideFX, 2022), as used in (Triantafyllou 
et al., 2022). After retopologising the mesh, any available texture information can be reprojected. If the captured 
texture data is insufficient, libraries like Quixel Megascans provide high-quality textures for approximating the 
surface material. 


Fig. 5: A diagram showing an overview of methods being used to reconstruct a detailed environment for VR 
(Wang et al., 2019) 
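
How a normal map encodes surface detail can be sketched with a simplified case. Baking normals from a high-polygon mesh onto a low-polygon mesh is more involved than shown here; this example assumes, purely for illustration, that the fine detail is available as a height map and converts it into the usual RGB-encoded tangent-space normals. The function name and the `strength` parameter are hypothetical.

```python
import numpy as np

def height_to_normal_map(height, strength=1.0):
    """Convert a 2D height map into an 8-bit RGB tangent-space normal map.

    height: float array of shape (rows, cols).
    strength: hypothetical scale factor exaggerating or flattening the relief.
    """
    height = height.astype(float)
    # Finite-difference slopes along the vertical (rows) and horizontal (cols) axes.
    dz_dy, dz_dx = np.gradient(height)
    # The surface normal tilts away from the direction of increasing height.
    normals = np.dstack((-strength * dz_dx, -strength * dz_dy, np.ones_like(height)))
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    # Remap each component from [-1, 1] to [0, 255] for storage as a texture.
    return ((normals * 0.5 + 0.5) * 255.0).round().astype(np.uint8)

flat = height_to_normal_map(np.zeros((4, 4)))
ramp = height_to_normal_map(np.tile(np.arange(4.0), (4, 1)))
print(flat[0, 0], ramp[0, 0])
```

A flat region maps to the characteristic light blue of normal maps, while sloped regions shift the red and green channels; the games engine reads these values per pixel to shade a low-polygon surface as if the detail were still present.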


One example (Alexander et al., 2009) involves the use of a stereo-camera rig with strategically placed lights. This 
rig is calibrated to capture multiple images simultaneously, each providing various lighting information: 
cross-polarized, parallel-polarized, and spectral line measurements for diffuse albedo, specular albedo, and 3D 
geometry, respectively. To aid data calibration, makeup dots on the target are used, ensuring they don't obstruct 
data while allowing precise realignment. By contrast, Wang et al. (2019) merge laser scan point clouds using 
registration software. Ground control points (GCPs) and the known locations of other scene features are used to 
combine scan data, 
creating a comprehensive indoor environment reconstruction. Additional scans made with structured light scanners 
add more detailed information to specific areas of interest for analysis. 
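
The role of ground control points in combining scans can be illustrated with a minimal sketch of rigid alignment from corresponding points (the Kabsch method). This is not the registration software used by Wang et al. (2019); it assumes that the same control points have been identified in both scans and that no scaling is needed. The function name and the sample points are hypothetical.

```python
import numpy as np

def rigid_align(source, target):
    """Find rotation R and translation t that best map source points onto target.

    source, target: arrays of shape (n, 3) holding corresponding points,
    for example the same ground control points located in two scans.
    """
    source_centre = source.mean(axis=0)
    target_centre = target.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (source - source_centre).T @ (target - target_centre)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection being returned instead of a rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = target_centre - R @ source_centre
    return R, t

# Hypothetical control points in scan A, and the same points as seen in scan B
# (rotated 90 degrees about the vertical axis and shifted).
source = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
rotation = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
shift = np.array([1.0, 2.0, 3.0])
target = source @ rotation.T + shift

R, t = rigid_align(source, target)
aligned = source @ R.T + t
print(np.round(aligned, 6))
```

Once R and t are known from the control points, the same transform is applied to every point in the scan, bringing the whole cloud into the shared coordinate frame.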


Once the data from these various methods has been combined, the next key challenge lies in effectively separating 
these components to enable interaction within the virtual environment. Advanced VR interactions, characterized 
by direct manipulation, diverse input devices, and high degrees of freedom, demand the division of the unseparated 
scan-data model into distinct, potentially modular components. 3D modeling software will play a pivotal role in 
separating the components for independent simulation of their interactions. 


For training purposes, equipment behaviors will also require virtual recreation. While the best approach is to have 
firsthand expert demonstrations of the equipment, this is often not feasible due to factors like high risks and limited 
accessibility. In such situations, an alternative approach is to attach recording equipment to a professional who can 
perform the necessary operations. This recorded footage can then be used as a reference for replicating the 
equipment's behavior in a virtual environment. 


3. TECHNOLOGY 


A standard asset creation pipeline involving scanning processes will require several pieces of hardware to collect 
data, with the appropriate software to process the information. We will also consider hardware and software 
required to develop functionality and render the equipment as interactable models within a VTE. The most 
effective solutions will be discussed below. 


3.1 Software 


Each step in this process necessitates specific software tools. Initially, images must be prepared for alignment, 
followed by running photogrammetry algorithms to construct textured models from these images. The subsequent 
phase involves processing the data obtained through 3D scanning to create a 3D model that faithfully represents 
the physical geometry of the scanned subject. This model must be optimized for seamless integration into a games 
engine for virtual interaction, and various texturing solutions will be evaluated. It's common to encounter multiple 
software options for each stage of the scanning process. Some software packages bundle applications to be used 
in tandem with diverse workflows, and open-source alternatives may also be available (see Table 1). For the 
software listed below, the minimum processing requirements would be a 2 GHz CPU and 16 GB or more of 
RAM. 


Table 1. A selection of software available from the Geomagic application suite, and corresponding open-source 
applications 


Geomagic software | Description | Open-source alternative 
Geomagic Capture | Scanner-specific registration software | (none listed) 
Geomagic Design X | Rebuild CAD data reverse engineered from scans | OpenCAD 
Geomagic Control X | Visualising and analysing data for quality control | Volume Graphics 
Geomagic Freeform | Manipulate and manage large unstructured meshes | MeshLab 


3.1.1 3D scanning software for 3D scanners 


To process the results of the scanning process, various specialized software solutions are employed to manage scan 
data and enhance scenes. This type of 3D scanning software is often bundled with 3D scanning hardware, 
and many developers have created their own software packages to accompany their laser scanners. Faro utilizes 
Faro Scene for scan registration and cleanup of collected geometry data, whilst Faro Zone 3D is used for tasks like 
importing high-res photos, utilizing registration targets, and performing metrics calculations within 3D 
reconstructions (Wang et al., 2019). Creaform's VX Elements is used to calibrate data collected from structured 
light scanners, while VX Model serves for more detailed scene modeling or measurements. Artec offers a 
comprehensive set of tools within Artec Studio, tailored for scan reconstruction and suitable for processing 
scan data. Additionally, CloudCompare is an open-source solution that allows us to compare and 
edit point clouds or meshes (Dewez et al., 2016). It provides the capability to transform scan data to ensure 
alignment with our photogrammetry reconstructions. 


3.1.2 Photogrammetry software 


RealityCapture is renowned as one of the top choices for photogrammetric reconstruction for speed, accuracy, and 
format compatibility. Due to its exceptional capabilities, it is available at a premium price point. Other popular 
premium software includes Metashape (Agisoft) and Recap Pro (Autodesk). 


There are many free photogrammetry packages, the most popular of which is Meshroom (AliceVision), 
which has been integrated as a free plug-in for 3D processing software such as Houdini (SideFX) and Maya 
(Autodesk). Other free solutions include 3DF Zephyr (free edition), COLMAP, and Regard3D. 


3.1.3 3D mesh processing/modelling 


3D mesh processing is a fundamental component of the 3D scanning and modeling pipeline, used to manipulate, 
refine, and optimize the three-dimensional mesh models generated from various data acquisition methods, such as 
laser scanning and photogrammetry. Most mesh processing packages support plug-ins which augment the 
capabilities of the software and cater to diverse project requirements. 


Premium solutions include 3DS Max, Maya (both Autodesk), Houdini (SideFX), and ZBrush. ZBrush is well 
known in the professional industry for its many highly advanced tools for tasks like cleaning, healing, and texturing. 
3DS Max offers cloth, light and liquid simulations and its own scripting language (MAXScript). Houdini’s 
procedural modeling solutions may provide scalability of modular components, enhancing the flexibility and 
efficiency of the asset creation and simulation process. 


Blender is a free and open-source 3D modeling package known for its versatility. It offers 
a wide spectrum of capabilities, making it a powerful tool for cleaning up scans and repairing meshes. While 
Blender has a learning curve, its wide availability means there is a wealth of learning resources online for 
techniques such as hard surface modelling. There are also plug-ins which allow you to create highly detailed 
materials, such as those authored in Substance Designer (Adobe), or to create powerful renders of 3D objects. 
Blender’s extensive features make it an ideal choice for tasks like modelling switchgear equipment. 


Among other free solutions are more limited options such as Autodesk TinkerCAD and Vectary. These tools 
operate directly in the web browser but are primarily designed to educate entry-level users. For instance, 
TinkerCAD is often integrated into 3D printing processes and has limitations, such as restricting OBJ uploads to 
models with up to 300,000 faces. 
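
A face limit of this kind can be checked before attempting an upload. The following sketch counts the face records 
in the text of an OBJ file and compares the total with the 300,000 figure quoted above. The function names and the 
small tetrahedron sample are hypothetical, and the check is only a rough guide under the assumption that each `f` 
record counts as one face. 

```python
TINKERCAD_FACE_LIMIT = 300_000  # figure quoted in the text above

def count_obj_faces(lines):
    """Count face records ('f ...') in an iterable of OBJ text lines.

    Each 'f' record is counted once, whether it is a triangle, quad or n-gon,
    so this is a face count rather than a triangle count.
    """
    return sum(1 for line in lines if line.lstrip().startswith("f "))

def fits_face_limit(lines, limit=TINKERCAD_FACE_LIMIT):
    """Return True when the model has no more faces than the given limit."""
    return count_obj_faces(lines) <= limit

# A made-up tetrahedron: four vertices and four triangular faces.
sample_obj = (
    "v 0 0 0\nv 1 0 0\nv 0 1 0\nv 0 0 1\n"
    "f 1 2 3\nf 1 2 4\nf 1 3 4\nf 2 3 4\n"
).splitlines()

face_count = count_obj_faces(sample_obj)
```

For a real scan, the lines would be read from the exported file, for example with `open(path)`, and a model that 
exceeds the limit would need decimating before upload. 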


More free and open-source options include OpenSCAD, FreeCAD, and Sculptris. OpenSCAD requires some prior 
skill, as objects must be coded: it works with primitive geometric shapes and reads the code to modify 
and render them, creating 3D models with constructive solid geometry (CSG), which can be beneficial when it 
comes to 3D printing your projects. FreeCAD is 3D modeling software that can be scripted in Python, which 
allows you to add new specialized features. Sculptris modifies pre-existing shapes with brushes of 
different strokes. 
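
To illustrate the idea behind constructive solid geometry, the sketch below combines two primitives with Boolean 
operations expressed on implicit functions (negative inside the solid, positive outside). This is written in Python 
rather than in OpenSCAD's own language, and it is an illustrative assumption about how CSG can be represented, not 
a description of how OpenSCAD is implemented. All names are hypothetical. 

```python
import math

def sphere(cx, cy, cz, radius):
    """Implicit sphere: negative inside, positive outside."""
    return lambda x, y, z: math.sqrt((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2) - radius

def box(hx, hy, hz):
    """Implicit axis-aligned box centred on the origin with the given half-extents.

    The sign is correct everywhere, although the value is not an exact
    Euclidean distance near the corners.
    """
    return lambda x, y, z: max(abs(x) - hx, abs(y) - hy, abs(z) - hz)

def union(a, b):
    return lambda x, y, z: min(a(x, y, z), b(x, y, z))

def intersection(a, b):
    return lambda x, y, z: max(a(x, y, z), b(x, y, z))

def difference(a, b):
    """Solid a with solid b removed."""
    return lambda x, y, z: max(a(x, y, z), -b(x, y, z))

def inside(shape, point):
    return shape(*point) < 0

# A cube with a spherical cavity cut out of its centre.
part = difference(box(1.0, 1.0, 1.0), sphere(0.0, 0.0, 0.0, 1.2))

centre_is_solid = inside(part, (0.0, 0.0, 0.0))      # False: removed by the sphere
corner_is_solid = inside(part, (0.95, 0.95, 0.95))   # True: cube corner survives
```

Because every part is defined by primitives and Boolean operations, the resulting solid is closed by construction, 
which is one reason the approach suits 3D printing. 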


3.1.4 Games Engines 


Lastly, we must consider the software that runs the simulation so that it may be viewed and interacted with by a 
VR user. Unreal Engine 5 (Epic Games) natively supports VR development and also has the Quixel Bridge feature, 
giving easy access to tools and resources which may save development time and labour on the project. 
Similar plug-ins are available for Unity and the open-source Godot Engine. These games 
engines provide the necessary framework for creating immersive and interactive virtual environments based on 
the 3D models and assets generated during the scanning and modeling process. 


3.2 Hardware 


Hardware plays a significant role in capturing visual data and running the software necessary for asset visualization. 
To achieve accurate photogrammetric reconstruction, the quality of the captured images is essential, motivating us 
to explore several camera options, including the Matterport Pro, DSLR cameras and, due to their wide availability, 
mobile phone cameras. For highly accurate metrology of our scanning targets, we shall 
review LiDAR and structured light scanners. Lastly, we will address hardware that may be used to provide user 
interaction within a virtual environment, such as head-mounted displays (HMDs), and review processing 
requirements. 


3.2.1 Cameras and registration 


Standard photographic equipment is often more accessible and cost-effective than other 3D scanning methods 
such as LiDAR or structured light scanning. The camera will be used to gather input images from which a 3D model 
with an accompanying texture is created by photogrammetry. When aiming for the greatest accuracy, images with a 
higher resolution are preferred, so opting for a camera of superior quality is justified. 


Cameras differ in quality, varying in number of pixels, sensor size, and field of view. A higher pixel count 
boosts the image resolution and captures finer detail, most noticeable when zoomed in. Different lenses can be used 
with different DSLRs to correctly calibrate the cameras for scanning purposes. Conversely, smartphones may not 
have as many customisable options or similar fine control over the image capturing process; however, as can be 
inferred from Table 2, smartphones can often offer sufficiently high-quality visual data, as well as being widely 
available, highly portable and very accessible. Some smartphones have a single camera, while others have dual or 
quad sensors. Frequently, however, the high-megapixel cameras used in consumer smartphones do not output photos 
at the full resolution the sensor is capable of, because of pixel-binning. 


Using a camera will be essential to capture texture and colour detail, as well as for providing proper reference for 
registration within the 3D processing software. 


Table 2. Comparing the Megapixel value of various available camera devices. 

Type | Device | Megapixels 
Mobile Phone | iPhone 14 Pro Max | 48MP, 12MP, 12MP 
Mobile Phone | iPhone 11 | 12MP, 12MP 
Mobile Phone | iPhone 6 | 8MP 
Mobile Phone | Samsung Galaxy Fold 5G | 16MP 
Mobile Phone | Google Pixel 7 | 50MP 
DSLR Camera | Nikon D3300 | 24MP 
DSLR Camera | Canon EOS 1D Mark III | 2.11MP 
DSLR Camera | Sony α7R | 61MP 
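

The effect of pixel-binning on output resolution, discussed in Section 3.2.1, can be illustrated with a short sketch. 
It assumes 2x2 binning, which is not stated in the text, and models binning as a simple average of neighbouring 
values; real sensors combine same-colour photosites in hardware, so this is a simplification. The function names and 
the sample grid are hypothetical. 

```python
def bin_pixels(image, factor=2):
    """Average each factor x factor block of a 2D grid of intensity values.

    Rows or columns that do not fill a complete block are discarded.
    """
    rows, cols = len(image), len(image[0])
    binned = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [image[r + i][c + j] for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        binned.append(out_row)
    return binned

def binned_megapixels(sensor_megapixels, factor=2):
    """Output resolution after binning: each block of factor^2 pixels becomes one."""
    return sensor_megapixels / (factor * factor)

# A made-up 4x4 sensor readout, reduced to 2x2 after binning.
sensor = [
    [10, 12, 20, 22],
    [14, 16, 24, 26],
    [30, 32, 40, 42],
    [34, 36, 44, 46],
]
binned = bin_pixels(sensor)            # [[13.0, 23.0], [33.0, 43.0]]
output_mp = binned_megapixels(48)      # 12.0 under the 2x2 assumption
```

Under this assumption a 48MP sensor yields 12MP photos, so the megapixel figure on a specification sheet can 
overstate the resolution of the images actually available for photogrammetry. 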


3.2.2 LiDAR Scanners 


Table 3. Illustrating the range in available LiDAR scanners depending on the required range of the scan. 
Manufacturer Short range Medium Range Long Range 

Artec Micro | Space Spider | Eva Lite Eva Leo Ray II 

Faro Gage FaroArm | Freestyle Vantage Focus 

Creaform R-series Go!Scan HandyScan MetraScan | MaxSHOT 3D 

Sick S300 series Tim-S OutdoorScan 3 
Leica BLK 360 RTC 360 Scanstation 


LiDAR scanners are known for their high accuracy and ability to capture intricate details. For the purposes of this 
project, they will be used for capturing complex geometries and surfaces with varying textures. Different scanners 
with different features are better suited to various scanning tasks depending on the object size and the necessary 
scan quality. Faro are well known for their mid-to-long-range scanners, and Creaform's handheld scanners have 
been used by similar projects. Other scanners include the Geomagic Capture and Capture Mini, ideal for 
“desktop scanning” of small objects up to the size of a shoebox, as well as the EinScan product range from 
Shining3D. 


Certain scanners integrate both camera components and structured-light sensors. This grants the scanners the 
ability to gather supplementary colour information, which is particularly valuable for laser position-tracking and 
registration processes. Some Artec scanners include cameras, allowing colours that the texture camera has captured 
to be applied to the 3D mesh being created. The quality of this texture is sufficient for a majority of metrological 
applications. The quality of the generated geometry depends on the selection of the scanner, the scanning distance, 
the lighting conditions, and the general execution of the scanning routine. 


3.2.3 Matterport 


Matterport is a company specializing in 3D scanning technology and software to capture and render 3D models of 
physical spaces. Their Matterport Pro Camera utilizes depth-sensing cameras and imaging sensors to create 3D 
point clouds of environments. The Matterport Pro2 3D camera offers 36MP images with a scan accuracy of +/- 
50mm, while the Pro3 improves accuracy to +/- 20mm at a 10m distance. This tripod-mounted device captures 
comprehensive visual data by rotating 360 degrees in a short time. However, there are privacy concerns regarding 
detailed models unintentionally capturing sensitive information. 


Matterport provides an iPad app for camera control, offering a "Dollhouse" view to identify unscanned areas. Users 
can navigate 3D models by selecting points within the model, making it popular for virtual property or office tours. 
They also have a mobile application using LiDAR sensors in phones to scan objects and generate 3D meshes 
in .obj format. While convenient, these scans may lack the precision needed for high-fidelity virtual assets, 
particularly in capturing intricate surface details. 


For this project, Matterport services have drawbacks. They can be costly due to hardware expenses, service charges, 
and the need for additional payment to access the metadata folder (MatterPak). The generated point cloud format 
(.xyz) lacks widespread compatibility, often requiring conversion to more universally accepted formats like .e57. 
Furthermore, Matterport's scanning technology might not provide the required accuracy and detail for the project, 
especially in capturing nuanced surface features necessary for high-fidelity 3D models. 


3.2.4 VR Hardware 


Different head-mounted displays have been designed for slightly different purposes. While most headsets come 
with controllers, not all controllers are the same. Because the head-mounted display is the hardware through which 
the student interfaces with the training environment, the controller will dictate the possible depth of interaction. In 
the context of this research, the emphasis is on a cost-effective and immersive VR solution. Many VR headsets 
can run the proposed simulation. However, a mid-range specification HMD with stand-alone capabilities is 
preferred over more powerful and expensive headsets such as the HTC Vive Pro line of HMDs. This choice 
imposes certain technological limitations on the performance of the 3D virtual representation. 


For this project, the target headset will be a Meta Quest 2 VR headset. As well as its performance capabilities, the 
Oculus Link cable accessory allows the HMD to interface easily with a PC for development and testing purposes. 
The Pico Neo line of HMDs boasts similar specifications to the Meta Quest 2, and both headsets have previously been 
used for virtual training and education purposes (Cowie & Alizadeh, 2022; Han et al., 2022; Moolman et al., 2022). 


4. EXEMPLIFYING THE FRAMEWORK: HIGH-VOLTAGE ELECTRICAL 
SWITCHGEAR 


Photogrammetry excels in capturing high-detail visual information, although, as mentioned, the resulting 
three-dimensional information may be susceptible to gaps, noise and inaccuracies. A fixed-camera or 
multi-camera set-up is feasible only for objects compatible with the rig in scale and shape, meaning such rigs are 
mostly applicable to small-to-medium objects. We shall be capturing objects on site in their professional 
environment, so lighting conditions may not be ideal. Because of this, the geometry that results from 
our photogrammetry effort will likely contain inconsistencies and not be very robust. For this reason we shall not 
rely on geometry data obtained this way; however, efforts will be made to retain any worthwhile texture 
information generated by the photoscan. 


The 3D geometry obtained from the LiDAR scan will likely be more robust; however, because lighting conditions 
on site cannot be fully controlled, a complete and consistent scan cannot be guaranteed. As mentioned in section 2, 
we shall seek to mitigate these inconsistencies by employing the light-section method and performing multiple 
overlapping scans. To avoid incongruities caused by erroneous position tracking, a consequence of featureless 
scanning geometry, one solution emerged as notably effective: affixing ping pong balls or golf balls onto the 
surfaces of the equipment using Blu Tack. These added surface features facilitated a more precise registration 
of the scanner's position during the scanning process. As the scanner traversed the modified surfaces, the added 
features provided the necessary points of reference for the algorithms to accurately determine the scanner's 
movement. Consequently, the scanner's accuracy improved significantly, and the issues of slippage and positional 
loss were effectively mitigated. 


After combining the LiDAR and photogrammetry data into a unified 3D visualization, we suggest employing 
Blender to refine and optimize this asset. In case the resulting asset falls short of the realism required for real-time 
VR, the reconstructed data will serve as a reference template for generating a new mesh. By utilizing the scan data 
as a guide, the precise measurements obtained from the scan data can inform the development of an equally 
accurate 3D object. Furthermore, we have access to suitable replacement textures to maintain our goal of 
photorealism. While this process may demand additional time and effort, it is essential for achieving an immersive 
virtual reality experience. 


[Figure 6: process diagram. LiDAR scan branch: prepare and collect front-end data (Artec Eva Lite); process scan 
data (Artec Studio); align the LiDAR model with the photogrammetry model (Cloud Compare); reproject high-quality 
texture information onto high-accuracy geometry information (Reality Capture). Photogrammetry branch: gather 
high-quality photos (DSLR camera); prepare photos and calculate LUTs (Lightroom); create the photogrammetry 
model (Reality Capture). Final stages: using the scan data as a template and employing hard-surface modelling 
techniques to create individual meshes for each independent part of the equipment; recover photogrammetry 
textures or acquire sufficient texture substitutions from Quixel Bridge; render in VR (Unreal Engine 5).] 


Fig 6: Process diagram outlining the methodology best suited to meet our needs of reconstructing a 
piece of equipment for virtual representation 


5. CONCLUSION 


This paper is structured to detail the methodological approach used in each stage, its limitations, and to empirically 
evaluate its effectiveness. By integrating advanced technologies and methodologies, this research strives to 
simplify the development of immersive training environments by reviewing and optimising the process of virtual 
representation. The framework presented is designed to methodically overcome various challenges, highlighting 
opportunities for automation of repetitious tasks associated with the necessary data processing, and facilitating a 
smooth shift from physical equipment to the production of highly lifelike virtual training environments. 